Contact:
Peter-Paul de Wolf
Statistics Netherlands
P.O. Box 24500
2490 HA The Hague
The Netherlands
Phone: +31 70 337 5060
Last update: 10 Oct 2011
Research on tabular data (WP 3)
Leading partner: StBA
Participating partners: StBA, TUIlm
Objectives
The work-package aims to provide the methodology, expertise and software needed to
reach the overall goal of the proposal with respect to tabular data, which is to create
a package suitable for establishment as the standard tool for disclosure control of
aggregated data. In terms of project management, the objective of this
work-package is to link the needs of the end-users to the software development, so as to
ensure wide usability and user-friendliness of the resulting software
package.
Within WP 3, all tasks will be co-ordinated by StBA.
Specific aims to address the overall goal of the work-package are the following:
Task 1
Refine and support the integration of the most desirable qualities and facilities of
existing software systems for tabular data protection into τ-ARGUS.
Software concept and design: Develop, propose and/or refine software concepts and the design of
user interfaces; supply methodological expertise (e.g. formulas, concepts and ideas
taken from other cell suppression packages). Work will be carried out by the
co-ordinator of WP 3 (StBA) in close co-operation with the co-ordinator of
WP 4.2 (CBS).
Concept and design for user-friendly cell suppression software applicable in a wide range of
situations, in particular to linked multiple and hierarchical
tables, with special features to synchronise suppression patterns between tables that
are already published and those that are to be newly released (e.g. customised
tables).
Task 2
Support the integration of the most recent version of the GHQUAR suppression algorithm
(cf. sec. 5 (Innovation)), which will ensure wide applicability of the package to
(linked) tables of any size and complexity of structure.
Supply GHQUAR: Support the implementation of the most recent version of GHQUAR
(applicable to linked tables, with automated weighting functionality for control
of the selection of secondary suppressions); supply expertise on this software and
make minor modifications to the GHQUAR software, should they turn out to be necessary.
Work will be carried out by a subcontractor, e.g. the developer of the GHQUAR software.
Integration of software suitable for the selection of secondary suppressions in
multiple tables of any size and complexity of structure.
Task 3
Ensure practical significance of the research on optimisation problems arising in the
development of cell suppression algorithms based on linear programming to be carried out
as one of the tasks of WP 4.1.
Support for the research on optimisation problems: A library of ‘close-to-real-life’
test instances will be developed and supplied to the OR researchers involved in the
project (cf. WP 4.1).
The set-up of the optimisation problems (cf. WP 4.1), as well as the research progress,
will be checked with the assistance of an independent expert (i.e. an expert not involved
in the development of linear programming methodology for cell suppression within this
project). Work will be carried out by the co-ordinator
of WP 3 (StBA) in close co-operation with a subcontractor (‘independent’ OR expert)
and with all partners involved in WP 4.1.
Close collaboration, providing assistance and feedback to the research partners involved
in the development of algorithms suitable for the selection of secondary suppressions in
moderately sized hierarchical tables, will reduce the risk that a considerable amount of
research is spent on problems of no practical significance. This will yield software
that keeps a good balance between quality (of the resulting suppression patterns, in terms
of information loss due to suppression) and quantity (e.g. the size of the tables that the
software can be applied to efficiently).
Task 4
Provide information on the performance of the various algorithms for secondary cell
suppression to be included in the final package. This information shall support the
transfer of the package and will as well be useful for guiding internal decisions to
be made within the software implementation work package WP 4.2.
Benchmarking: Each of the algorithms for the selection of secondary suppressions implemented
in the package will be run on the set of tables from the test library (see Task 3 above).
Performance with respect to certain key issues (information loss in terms of the number
and/or total value of suppressions, computing time required, etc.) will be recorded.
Work will be carried out by a subcontractor (University OR-department).
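The benchmark measures named above (number of suppressions, total suppressed value, computing time) can be illustrated with a minimal Python sketch. The table, the fixed suppression pattern and the names `information_loss`, `timed` and `toy_algorithm` are hypothetical and serve illustration only; they are not part of any package described here.

```python
import time

def information_loss(table, suppressed):
    """Return (number of suppressed cells, total value of suppressed cells)."""
    values = [table[cell] for cell in suppressed]
    return len(values), sum(values)

def timed(algorithm, *args):
    """Run an algorithm and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = algorithm(*args)
    return result, time.perf_counter() - start

# Hypothetical 3x3 table of interior cells, keyed by (row, column).
table = {
    ("r1", "c1"): 20, ("r1", "c2"): 50, ("r1", "c3"): 10,
    ("r2", "c1"): 8,  ("r2", "c2"): 19, ("r2", "c3"): 22,
    ("r3", "c1"): 17, ("r3", "c2"): 32, ("r3", "c3"): 12,
}

def toy_algorithm(table):
    # Stand-in for a real secondary suppression algorithm:
    # it simply returns a fixed rectangular suppression pattern.
    return {("r2", "c1"), ("r2", "c3"), ("r3", "c1"), ("r3", "c3")}

pattern, seconds = timed(toy_algorithm, table)
count, total_value = information_loss(table, pattern)
print(f"suppressed cells: {count}, suppressed value: {total_value}")
```

Recording the same three figures for every algorithm and every test table would give a comparable basis for the decisions mentioned in the task description.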
Task 5
Maximise the information content of tables with suppressed entries, e.g. after a
suppression procedure has been carried out.
Table perturbation techniques: Development and implementation of algorithms to calculate
the upper and lower bounds which any suppressed value could take without violating
the constraints implied by the additive relationships within the table, and perturbed values
to replace suppressed original cell entries. The perturbed values will lie between
the upper and lower bounds (see above) and will match submarginals and marginals, so
that table additivity is maintained. Work will be carried out by a subcontractor
(University OR-department).
The package will be able to calculate lower and upper bounds for suppressed entries
that can be released safely, thus reducing the loss of information to the amount needed
to protect the sensitive cells. Instead of being presented with suppressed entries or lower
and upper bounds for them, data users may for many purposes prefer a single value to
replace the suppressed entry. The software will therefore offer perturbed values of the original
cell entries, which will maintain table additivity without disclosing individual
information.
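How additive relationships constrain suppressed entries can be illustrated with a minimal Python sketch. It assumes non-negative cell values and uses simple interval propagation rather than the linear programming an actual implementation would more likely rely on, so the bounds it returns are valid but may in general be wider than the exact ones. The function name `propagate_bounds`, the cell names and the numbers are hypothetical.

```python
import math

def propagate_bounds(constraints, cells, max_rounds=100):
    """Tighten [lower, upper] intervals for suppressed cells.

    constraints: list of (cell_names, remaining_total), where remaining_total
    is the published total minus the published cells of that row or column.
    Assumes every suppressed cell is non-negative.
    """
    lo = {c: 0 for c in cells}
    hi = {c: math.inf for c in cells}
    for _ in range(max_rounds):
        changed = False
        for members, total in constraints:
            for c in members:
                others = [m for m in members if m != c]
                # c = total - sum(others), so bound it via the others' intervals.
                new_hi = total - sum(lo[m] for m in others)
                new_lo = total - sum(hi[m] for m in others)
                if new_hi < hi[c]:
                    hi[c] = new_hi
                    changed = True
                if new_lo > lo[c]:
                    lo[c] = new_lo
                    changed = True
        if not changed:
            break
    return {c: (lo[c], hi[c]) for c in cells}

# Hypothetical example: four suppressed cells forming a rectangle.
cells = ["r2c1", "r2c3", "r3c1", "r3c3"]
constraints = [
    (["r2c1", "r2c3"], 30),  # row 2: total 49 minus published 19
    (["r3c1", "r3c3"], 29),  # row 3: total 61 minus published 32
    (["r2c1", "r3c1"], 25),  # column 1: total 45 minus published 20
    (["r2c3", "r3c3"], 34),  # column 3: total 44 minus published 10
]
bounds = propagate_bounds(constraints, cells)
for cell in cells:
    print(cell, bounds[cell])
```

In this example a cell with a true value of 8 (`r2c1`) could be published as the interval 0 to 25. Any replacement values chosen inside the intervals would still have to satisfy all four row and column equations simultaneously to keep the table additive.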
Task 6
Gain expertise with newly implemented facilities for control of the selection of
secondary suppressions. In particular, the ‘European dimension’ of the secondary
cell suppression problem shall be addressed, e.g. how to ease and sustain approaches
to co-ordinating suppression patterns within Europe, such as that suggested by Eurostat for
application to data of the structural business survey (Doc. Eurostat/D2/SBS-T/NOV99/03).
Co-ordination of suppression patterns as a special application: Specific problems arise
when data are published on different levels of a regional classification (e.g. on the
national and on the supranational (EU) level, or on the regional and national level)
but secondary suppressions are actually assigned by different agencies (e.g.
NSIs and Eurostat, or regional and national statistical institutes). This problem,
which results from the decentralised organisation of official statistics within Europe,
will be tackled using the facilities of the software as implemented so far. The feasibility
of several approaches to improving the situation will be researched, in particular the
approach suggested by Eurostat (cf. the objective of Task 6 above), with a particular view to the
practicability of any methods suggested. The methods will be applied to several
real-life datasets, available on the national as well as the regional level. Methods that turn
out to be promising will be supported by the software package, e.g. special options
will be included in the software if necessary. Work will be carried out by the
co-ordinator of WP 3 (StBA).
Expertise with newly implemented facilities for control of the selection of secondary
suppressions will be acquired and used to assist users with special
needs, ensuring in particular that the package is applicable at Eurostat, to the
extent possible, for the co-ordination of national suppression patterns.